Optimal prediction of the number of unseen species.

Authors

  • Alon Orlitsky
  • Ananda Theertha Suresh
  • Yihong Wu
Abstract

Estimating the number of unseen species is an important problem in many scientific endeavors. Its most popular formulation, introduced by Fisher et al. [Fisher RA, Corbet AS, Williams CB (1943) J Animal Ecol 12(1):42-58], uses n samples to predict the number U of hitherto unseen species that would be observed if [Formula: see text] new samples were collected. Of considerable interest is the largest ratio t between the number of new and existing samples for which U can be accurately predicted. In seminal works, Good and Toulmin [Good I, Toulmin G (1956) Biometrika 43(1-2):45-63] constructed an intriguing estimator that predicts U for all [Formula: see text]. Subsequently, Efron and Thisted [Efron B, Thisted R (1976) Biometrika 63(3):435-447] proposed a modification that empirically predicts U even for some [Formula: see text], but without provable guarantees. We derive a class of estimators that provably predict U all the way up to [Formula: see text]. We also show that this range is the best possible and that the estimator's mean-square error is near optimal for any t. Our approach yields a provable guarantee for the Efron-Thisted estimator and, in addition, a variant with stronger theoretical and experimental performance than existing methodologies on a variety of synthetic and real datasets. The estimators are simple, linear, computationally efficient, and scalable to massive datasets. Their performance guarantees hold uniformly for all distributions, and apply to all four standard sampling models commonly used across various scientific disciplines: multinomial, Poisson, hypergeometric, and Bernoulli product.
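
As a rough illustration of what a "simple, linear" estimator of this kind looks like, the sketch below implements the classical Good-Toulmin estimator in its standard textbook form, U ≈ -Σ_i (-t)^i Φ_i, where Φ_i is the number of species seen exactly i times in the existing samples. It also includes a smoothed variant that damps the i-th term by a Poisson tail probability. The abstract does not state these formulas or how the smoothing is tuned, so the function names and the rate parameter `r` are assumptions made for illustration only, not the authors' implementation.

```python
from collections import Counter
from math import exp


def prevalences(samples):
    """Return a mapping i -> number of species observed exactly i times."""
    species_counts = Counter(samples)
    return Counter(species_counts.values())


def good_toulmin(samples, t):
    """Classical Good-Toulmin estimate of the number of new species
    expected in t * len(samples) further samples."""
    phi = prevalences(samples)
    return -sum((-t) ** i * count for i, count in phi.items())


def poisson_tail(i, r):
    """P(L >= i) for L ~ Poisson(r), computed from the partial CDF."""
    term = exp(-r)  # P(L = 0)
    cdf = 0.0
    for k in range(i):
        cdf += term
        term *= r / (k + 1)  # P(L = k + 1) from P(L = k)
    return 1.0 - cdf


def smoothed_good_toulmin(samples, t, r):
    """Good-Toulmin with the i-th term weighted by P(L >= i).
    The rate r is a free, hypothetical tuning parameter here."""
    phi = prevalences(samples)
    return -sum(
        (-t) ** i * poisson_tail(i, r) * count for i, count in phi.items()
    )
```

For example, with the seven observations `["a", "a", "b", "c", "c", "c", "d"]` the prevalences are Φ_1 = 2, Φ_2 = 1, Φ_3 = 1, so `good_toulmin(samples, 1)` evaluates to 2. Because the terms grow like t^i, the unsmoothed sum becomes unstable for t > 1, which is the motivation for down-weighting the higher-order terms.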

Similar resources

Prediction of potential habitat distribution of Artemisia sieberi Besser using data-driven methods in Poshtkouh rangelands of Yazd province

The present study aimed to model the potential habitat distribution of A. sieberi and its ecological requirements using a generalized additive model (GAM) and a classification and regression tree (CART) in the Poshtkouh rangelands of Yazd province. For this purpose, pure habitats of the species were delineated and the species presence data were recorded by the systematic-randomized sampling method. Us...

Estimating the spatial distribution range of plant species using an artificial neural network in the rangelands of western Taftan

This study aimed to estimate the spatial distribution range of plant species and to prepare predictive distribution maps of plant species using an Artificial Neural Network (ANN) in the western Taftan rangelands of Khash city. To this end, vegetation sampling was carried out by the random-systematic method after identification and separation of plant species habitats. In order to sample the soil at each h...

Surveying Introspection of Architecture of Jame` Mosque of Isfahan with Emphasis on Grounded Study of Unseen Concepts of Hafez' and Mulavi's Lyrics

There are close relationships between the hidden structures of mosques and the unseen concepts embodied in the Persian language and literature of Iran, which show that the construction of famous mosques in Iran, especially in the Isfahan style, is immortal and timeless. A question arises in this context as to what factors have led to the manifestation of unseen concepts in the architecture of Isfahan mosques object...

A Critique of the View Claiming Conflict in the Verses of the Knowledge of the Unseen

The claim of conflict in the verses of the knowledge of the unseen in Quran is one of those made by Brasher – the Jewish orientalist. He believes that the verses which consider the knowledge of the unseen to be only specific to God are in conflict with those verses referring apparently to the Prophet (p.b.u.h) and some of the divine selected people's awareness of the unseen. Classifying the ver...

New Optimal Observer Design Based on State Prediction for a Class of Non-linear Systems Through Approximation

This paper deals with the optimal state observer of non-linear systems based on a new strategy. Despite the development of state prediction in linear systems, state prediction for non-linear systems is still challenging. In this paper, to obtain a future estimation of the system states, initially Taylor series expansion of states in their receding horizons was achieved to any specified order an...

Journal:
  • Proceedings of the National Academy of Sciences of the United States of America

Volume 113, Issue 47

Publication date: 2016